153 research outputs found

    Common Representation Learning Using Step-based Correlation Multi-Modal CNN

    Deep learning techniques have been successfully used to learn a common representation for multi-view data, wherein the different modalities are projected onto a common subspace. Broadly, the techniques used to investigate common representation learning fall into two categories: canonical correlation-based approaches and autoencoder-based approaches. In this paper, we investigate the performance of deep autoencoder-based methods on multi-view data. We propose a novel step-based correlation multi-modal CNN (CorrMCNN) that reconstructs one view of the data given the other while increasing the interaction between the representations at each hidden layer, i.e., at every intermediate step. Finally, we evaluate the performance of the proposed model on two benchmark datasets, MNIST and XRMB. Through extensive experiments, we find that the proposed model achieves better performance than the current state-of-the-art techniques on joint common representation learning and transfer learning tasks.
    Comment: Accepted at the Asian Conference on Pattern Recognition (ACPR-2017)
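    The abstract does not give CorrMCNN's exact objective, but the core idea (encouraging correlation between the two views' hidden representations at each step) can be sketched with a simple per-dimension correlation loss. The function name and shapes here are illustrative assumptions, not the paper's implementation:

    ```python
    import numpy as np

    def correlation_loss(h1, h2, eps=1e-8):
        """Negative mean per-dimension Pearson correlation between two
        batches of hidden representations, each of shape (batch, dim).
        Minimizing this pushes the two views' representations to align."""
        h1c = h1 - h1.mean(axis=0)
        h2c = h2 - h2.mean(axis=0)
        num = (h1c * h2c).sum(axis=0)
        den = np.sqrt((h1c ** 2).sum(axis=0) * (h2c ** 2).sum(axis=0)) + eps
        return -np.mean(num / den)
    ```

    In a step-based scheme, a term like this would be added to the reconstruction loss at every hidden layer of the two branches, so the interaction between views is reinforced at each intermediate step rather than only at the bottleneck.
    
    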

    Low-order coupled map lattices for estimation of wake patterns behind vibrating flexible cables

    Fluid-structure interaction arises in a wide array of technological applications including naval and marine hydrodynamics, civil and wind engineering, and flight vehicle aerodynamics. When a fluid flows over a bluff body such as a circular cylinder, the periodic vortex shedding in the wake causes fluctuating lift and drag forces on the body. This phenomenon can lead to fatigue damage of the structure due to large amplitude vibration. It is widely believed that the wake structures behind the structure determine the hydrodynamic forces acting on the structure, and that control of wake structures can lead to vibration control of the structure. Modeling this complex non-linear interaction requires coupling of the dynamics of the fluid and the structure. In this thesis, however, the vibration of the flexible cylinder is prescribed, and the focus is on modeling the fluid dynamics in its wake. Low-dimensional iterative circle maps have been found to predict the universal dynamics of a two-oscillator system such as the rigid cylinder wake. Coupled map lattice (CML) models that combine a series of low-dimensional circle maps with a diffusion model have previously predicted qualitative features of wake patterns behind freely vibrating cables at low Reynolds number. However, the simple nature of the CML models implies that there will always be unmodelled wake dynamics if a detailed, quantitative comparison is made with laboratory or simulated wake flows. Motivated by a desire to develop an improved CML model, we incorporate self-learning features into a new CML that is trained to precisely estimate wake patterns from target numerical simulations and experimental wake flows. The eventual goal is to have the CML learn from a laboratory flow in real time. A real-time self-learning CML capable of estimating experimental wake patterns could serve as a wake model in a future anticipated feedback control system designed to produce desired wake patterns.
A new convective-diffusive map that includes additional wake dynamics is developed. Two different self-learning CML models, each capable of precisely estimating complex wake patterns, have been developed by considering additional dynamics from the convective-diffusive map. The new self-learning CML models use adaptive estimation schemes which seek to precisely estimate target wake patterns from numerical simulations and experiments. In the first self-learning CML, the estimator scheme uses a multi-variable least-squares algorithm to adaptively vary the spanwise velocity distribution in order to minimize the state error (the difference between modeled and target wake patterns). The second self-learning model uses radial basis function neural networks as online approximators of the unmodelled dynamics. Additional unmodelled dynamics not present in the first self-learning CML model are considered here. The estimator model uses a combination of a multi-variable normalized least-squares scheme and a projection algorithm to adaptively vary the neural network weights. Studies of this approach are conducted using wake patterns from spectral-element-based NEKTAR simulations of freely vibrating cable wakes at low Reynolds numbers on the order of 100. It is shown that the self-learning models accurately and efficiently estimate the simulated wake patterns within several shedding cycles. Next, experimental wake patterns behind different configurations of rigid cylinders were obtained. The self-learning CML models were then used for off-line estimation of the stored wake patterns. With the eventual goal of incorporating low-order CML models into a wake pattern control system in mind, in a related study, control terms were added to the simple CML model in order to drive the wake to the desired target pattern of shedding. Proportional, adaptive proportional and non-linear control techniques were developed and their control efficiencies compared.
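    The thesis's own maps and estimation schemes are not spelled out in this abstract, but the basic building block it describes (a lattice of circle maps coupled by diffusion) has a standard minimal form. The parameter values below are illustrative assumptions:

    ```python
    import numpy as np

    def circle_map(theta, omega=0.16, k=1.0):
        # Standard sine circle map; the phase is kept in [0, 1).
        return (theta + omega - (k / (2 * np.pi)) * np.sin(2 * np.pi * theta)) % 1.0

    def cml_step(phases, eps=0.3):
        # Diffusively coupled map lattice: each site blends its own
        # circle-map update with its neighbours' (periodic boundary).
        f = circle_map(phases)
        return (1 - eps) * f + (eps / 2) * (np.roll(f, 1) + np.roll(f, -1))

    # Iterate a 64-site lattice from random initial phases.
    phases = np.random.default_rng(0).random(64)
    for _ in range(200):
        phases = cml_step(phases)
    ```

    Each lattice site stands in for the shedding phase at one spanwise station along the cable; the diffusion term is what lets spanwise patterns (e.g. lace-like or oblique shedding) emerge. A self-learning variant would adapt parameters such as `omega` or the coupling per site to drive the lattice state toward a target wake pattern.
    
    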

    Robust Watermarking in Multiresolution Walsh-Hadamard Transform

    In this paper, a new variant of the Walsh-Hadamard transform, namely the multiresolution Walsh-Hadamard transform (MR-WHT), is proposed for images. Further, a robust watermarking scheme is proposed for copyright protection using MR-WHT and singular value decomposition. The core idea of the proposed scheme is to decompose an image using MR-WHT and then modify the middle singular values of the high-frequency sub-band at the coarsest and the finest levels with the singular values of the watermark. Finally, a reliable watermark extraction scheme is developed to extract the watermark from the distorted image. The experimental results show better visual imperceptibility and resiliency of the proposed scheme against a variety of intentional and unintentional attacks.
    Comment: 6 pages, 16 figures, 2 tables
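    The embedding step described above (modifying a sub-band's middle singular values with the watermark's singular values) can be sketched as follows. The MR-WHT decomposition itself is omitted, and the additive rule with strength `alpha` is an assumption; the paper's exact modification rule may differ:

    ```python
    import numpy as np

    def embed_watermark(subband, watermark, alpha=0.1):
        """Embed a watermark into a (high-frequency) sub-band by adding
        its singular values to the middle singular values of the sub-band,
        then reconstructing. Illustrative sketch, not the paper's code."""
        u, s, vt = np.linalg.svd(subband, full_matrices=False)
        sw = np.linalg.svd(watermark, compute_uv=False)
        k = len(sw)
        lo = (len(s) - k) // 2          # middle band of singular values
        s_mod = s.copy()
        s_mod[lo:lo + k] += alpha * sw
        return u @ np.diag(s_mod) @ vt
    ```

    Targeting the middle singular values is a common compromise: the largest ones carry most of the image energy (modifying them hurts imperceptibility), while the smallest are easily destroyed by compression or filtering (hurting robustness).
    
    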

    Hybrid Fusion Based Interpretable Multimodal Emotion Recognition with Limited Labelled Data

    This paper proposes a multimodal emotion recognition system, VIsual Spoken Textual Additive Net (VISTA Net), to classify emotions reflected by multimodal input containing image, speech, and text into discrete classes. A new interpretability technique, K-Average Additive exPlanation (KAAP), has also been developed that identifies the important visual, spoken, and textual features leading to the prediction of a particular emotion class. VISTA Net fuses information from the image, speech, and text modalities using a hybrid of early and late fusion, automatically adjusting the weights of their intermediate outputs while computing the weighted average. The KAAP technique computes the contribution of each modality and of the corresponding features toward predicting a particular emotion class. To mitigate the scarcity of multimodal emotion datasets labeled with discrete emotion classes, we have constructed a large-scale IIT-R MMEmoRec dataset consisting of images, corresponding speech and text, and emotion labels ('angry,' 'happy,' 'hate,' and 'sad'). VISTA Net achieves 95.99% emotion recognition accuracy on the IIT-R MMEmoRec dataset when using the visual, audio, and textual modalities together, outperforming configurations that use only one or two modalities.
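    The late-fusion step described above (a weighted average of per-modality outputs with automatically adjusted weights) can be sketched as follows. In the real system the weights would be learned end to end; here they are simply passed in, and the softmax normalization is an assumption:

    ```python
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def additive_fusion(logit_list, raw_weights):
        """Weighted average of per-modality class scores. The raw weights
        are softmax-normalized so they stay positive and sum to 1."""
        w = softmax(np.asarray(raw_weights, dtype=float))
        stacked = np.stack(logit_list)       # (modalities, classes)
        return (w[:, None] * stacked).sum(axis=0)
    ```

    With equal raw weights this reduces to a plain mean of the modality scores; training would shift the weights toward whichever modalities are most informative for the current input.
    
    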

    See Through the Fog: Curriculum Learning with Progressive Occlusion in Medical Imaging

    In recent years, deep learning models have revolutionized medical image interpretation, offering substantial improvements in diagnostic accuracy. However, these models often struggle with challenging images where critical features are partially or fully occluded, which is a common scenario in clinical practice. In this paper, we propose a novel curriculum learning-based approach to train deep learning models to handle occluded medical images effectively. Our method progressively introduces occlusion, starting from clear, unobstructed images and gradually moving to images with increasing occlusion levels. This ordered learning process, akin to human learning, allows the model to first grasp simple, discernible patterns and subsequently build upon this knowledge to understand more complicated, occluded scenarios. Furthermore, we present three novel occlusion synthesis methods, namely Wasserstein Curriculum Learning (WCL), Information Adaptive Learning (IAL), and Geodesic Curriculum Learning (GCL). Our extensive experiments on diverse medical image datasets demonstrate substantial improvements in model robustness and diagnostic accuracy over conventional training methodologies.
    Comment: 20 pages, 3 figures, 1 table
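    The curriculum described above (clear images first, then progressively heavier occlusion) can be sketched with a simple linear schedule. The square-patch occlusion below is a generic stand-in, not one of the paper's WCL/IAL/GCL synthesis methods, and all parameter names are illustrative:

    ```python
    import numpy as np

    def occlude(image, fraction, rng):
        """Zero out a random square patch covering roughly `fraction`
        of the image area (a simple occlusion-synthesis stand-in)."""
        h, w = image.shape
        side = max(1, int(np.sqrt(fraction * h * w)))
        top = rng.integers(0, h - side + 1)
        left = rng.integers(0, w - side + 1)
        out = image.copy()
        out[top:top + side, left:left + side] = 0.0
        return out

    def curriculum(images, n_stages=5, max_fraction=0.5, seed=0):
        """Yield training batches with linearly increasing occlusion:
        clear images first, heavily occluded ones last."""
        rng = np.random.default_rng(seed)
        for stage in range(n_stages):
            frac = max_fraction * stage / (n_stages - 1)
            yield [occlude(im, frac, rng) if frac > 0 else im for im in images]
    ```

    A trainer would consume one stage per epoch (or per group of epochs), so the model only sees heavily occluded images after it has learned from unobstructed ones.
    
    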